    Robust Accelerating Control for Consistent Node Dynamics in a Platoon of CAVs

    Driving as a platoon has the potential to significantly benefit traffic capacity and safety. To give the nodes of a platoon of connected and automated vehicles (CAVs) more nearly identical dynamics, this chapter presents a robust acceleration controller based on a multiple-model control structure. The large uncertainty of the node dynamics is divided into smaller pieces using multiple uncertain models, and a robust controller is designed for each model. Based on the errors between the current node and the multiple models, a scheduling logic is proposed that automatically switches the most appropriate candidate controller into the loop. Even under relatively large plant uncertainties, this method offers consistent and approximately linear node dynamics, which simplifies the synthesis of the upper-level platoon controller. The method is validated by comparative simulations against a sliding mode controller and a fixed H∞ controller.
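
    The switching mechanism can be pictured with a small self-contained sketch. Everything concrete below (first-order lag candidate models, the time constants and gains, the integral control law) is an illustrative assumption rather than the chapter's actual design; it only shows the pattern of scoring several candidate models against measurements and switching the best-matching controller into the loop.

```python
import numpy as np

# Hypothetical multiple-model switching controller. Each candidate model is a
# first-order lag  a_dot = (u - a) / tau  with a different time constant, and
# each has its own feedback gain. The scheduler picks the candidate whose
# one-step prediction error against the measured acceleration is smallest
# (a simple stand-in for the scheduling logic described in the abstract).

DT = 0.01                           # simulation step [s]
TAUS = np.array([0.3, 0.5, 0.8])    # candidate powertrain time constants
GAINS = np.array([4.0, 3.0, 2.0])   # candidate feedback gains

def plant(a, u, tau_true=0.65):
    """True (uncertain) node dynamics: first-order lag on acceleration."""
    return a + DT * (u - a) / tau_true

def step(a_meas, a_pred, u_prev, a_ref):
    """One controller update: score candidates, switch, compute new command."""
    errors = np.abs(a_meas - a_pred)            # prediction error per model
    k = int(np.argmin(errors))                  # best-matching candidate
    u = u_prev + DT * GAINS[k] * (a_ref - a_meas)   # simple integral action
    # Propagate every candidate model one step for the next comparison.
    a_pred_next = a_pred + DT * (u - a_pred) / TAUS
    return u, a_pred_next, k

# Tiny closed-loop run tracking a constant acceleration reference.
a, u = 0.0, 0.0
a_pred = np.zeros_like(TAUS)
for _ in range(500):
    a = plant(a, u)
    u, a_pred, k = step(a, a_pred, u, a_ref=1.0)
print(f"selected model {k}, acceleration {a:.3f} m/s^2 (ref 1.0)")
```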

    Feasible Policy Iteration

    Safe reinforcement learning (RL) aims to solve an optimal control problem under safety constraints. Existing direct safe RL methods use the original constraint throughout the learning process. They either lack theoretical guarantees on the policy during iteration or suffer from infeasibility problems. To address this issue, we propose an indirect safe RL method called feasible policy iteration (FPI) that iteratively uses the feasible region of the last policy to constrain the current policy. The feasible region is represented by a feasibility function called the constraint decay function (CDF). The core of FPI is a region-wise policy update rule called feasible policy improvement, which maximizes the return under the constraint of the CDF inside the feasible region and minimizes the CDF outside the feasible region. This update rule is always feasible and ensures that the feasible region monotonically expands and the state-value function monotonically increases inside the feasible region. Using the feasible Bellman equation, we prove that FPI converges to the maximum feasible region and the optimal state-value function. Experiments on classic control tasks and Safety Gym show that our algorithms achieve lower constraint violations and comparable or higher performance than the baselines.
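
    A tabular toy can make the region-wise update rule concrete. The chain MDP, the discounted constraint-return form of the CDF, and the zero feasibility threshold below are assumptions chosen for brevity, not the paper's exact construction; the sketch only illustrates "maximize return subject to the CDF inside the feasible region, minimize the CDF outside it".

```python
import numpy as np

# Toy tabular sketch of feasible policy iteration (FPI). The MDP and the
# discounted form of the feasibility function are illustrative assumptions.

N_S, N_A, GAMMA = 6, 2, 0.9
# Deterministic chain: action 0 moves left, action 1 moves right.
nxt = np.array([[max(s - 1, 0), min(s + 1, N_S - 1)] for s in range(N_S)])
reward = np.array([0., 0., 0., 0., 1., 0.])   # reward for being in state 4
cost = np.array([0., 0., 0., 0., 0., 1.])     # state 5 violates the constraint

def evaluate(pi, signal):
    """Discounted policy evaluation of a per-state signal (reward or cost)."""
    v = np.zeros(N_S)
    for _ in range(500):
        v = signal + GAMMA * v[nxt[np.arange(N_S), pi]]
    return v

pi = np.ones(N_S, dtype=int)          # start by always moving right (unsafe)
for _ in range(20):
    v = evaluate(pi, reward)          # state-value function
    h = evaluate(pi, cost)            # CDF-like feasibility function
    q_v = reward[:, None] + GAMMA * v[nxt]
    q_h = cost[:, None] + GAMMA * h[nxt]
    new_pi = pi.copy()
    for s in range(N_S):
        if h[s] <= 1e-8:              # inside the current feasible region:
            ok = np.where(q_h[s] <= 1e-8)[0]       # keep the constraint ...
            new_pi[s] = ok[np.argmax(q_v[s, ok])]  # ... and maximize return
        else:                         # outside: shrink the violation instead
            new_pi[s] = int(np.argmin(q_h[s]))
    pi = new_pi

print("policy:", pi)                  # learns to avoid the violating state 5
print("feasible region:", np.where(evaluate(pi, cost) <= 1e-8)[0])
```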

    Effect of Pulse‐and‐Glide Strategy on Traffic Flow for a Platoon of Mixed Automated and Manually Driven Vehicles

    The fuel consumption of ground vehicles is significantly affected by how they are driven. Fuel-optimized vehicle automation techniques can improve the fuel economy of the host vehicle, but their effectiveness for a platoon of vehicles is still unknown. This article studies the performance of a well-known fuel-optimized automation strategy, the Pulse-and-Glide (PnG) operation, with respect to traffic smoothness and fuel economy in mixed traffic flow. The mixed traffic flow is assumed to be a single-lane highway on a flat road carrying both driverless and manually driven vehicles. The driverless vehicles are equipped with a fuel-economy-oriented automated controller using the PnG strategy. The manually driven vehicles are simulated with the Intelligent Driver Model (IDM) to mimic the average car-following behavior of human drivers in naturalistic traffic. A series of simulations is conducted for three scenarios: a single car, a car section, and a car platoon. The simulation results show that the PnG strategy can significantly improve the fuel economy of individual vehicles, whereas for traffic flows the fuel economy and traffic smoothness vary significantly under the PnG strategy.
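
    For reference, here is a minimal sketch of the two longitudinal models mentioned above: the standard IDM acceleration law for the manually driven vehicles, and a crude bang-bang stand-in for the PnG speed cycling. The parameter values and speed thresholds are illustrative assumptions, not those used in the article.

```python
import math

# Standard IDM car-following law plus a hypothetical pulse-and-glide switch.

def idm_accel(v, v_lead, gap,
              v0=33.3, T=1.5, a_max=1.0, b=1.5, s0=2.0, delta=4):
    """Intelligent Driver Model: acceleration of a human-driven follower.

    v      -- own speed [m/s]
    v_lead -- speed of the preceding vehicle [m/s]
    gap    -- bumper-to-bumper distance to the preceding vehicle [m]
    """
    dv = v - v_lead                                   # approaching rate
    s_star = s0 + v * T + v * dv / (2 * math.sqrt(a_max * b))
    return a_max * (1 - (v / v0) ** delta - (s_star / gap) ** 2)

def png_accel(v, pulsing, v_low=19.0, v_high=21.0,
              a_pulse=0.8, a_glide=-0.05):
    """Crude pulse-and-glide switch: accelerate between a lower and an upper
    speed bound, then coast back down (hypothetical thresholds)."""
    if v >= v_high:
        pulsing = False
    elif v <= v_low:
        pulsing = True
    return (a_pulse if pulsing else a_glide), pulsing

# Example: one IDM follower behind a PnG-controlled lead vehicle.
dt, v_lead, v_follow, gap, pulsing = 0.1, 20.0, 20.0, 30.0, True
for _ in range(600):                                  # 60 s of driving
    a_lead, pulsing = png_accel(v_lead, pulsing)
    a_follow = idm_accel(v_follow, v_lead, gap)
    v_lead += a_lead * dt
    v_follow += a_follow * dt
    gap += (v_lead - v_follow) * dt
print(f"follower speed {v_follow:.1f} m/s, gap {gap:.1f} m")
```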

    Stability and scalability of homogeneous vehicular platoon: study on the influence of information flow topologies

    In addition to decentralized controllers, the information flow among vehicles can significantly affect the dynamics of a platoon. This paper studies the influence of information flow topology on the internal stability and scalability of homogeneous vehicular platoons moving in a rigid formation. A linearized vehicle longitudinal dynamic model is derived using the exact feedback linearization technique, which accommodates the inertial delay of the powertrain dynamics. Directed graphs are adopted to describe the different types of allowable information flow interconnecting vehicles, covering both radar-based sensing and vehicle-to-vehicle (V2V) communication. Under linear feedback controllers, a unified internal stability theorem is proved using algebraic graph theory and the Routh-Hurwitz stability criterion. The theorem explicitly establishes the stabilizing thresholds of the linear controller gains for platoons under a large class of information flow topologies. Using matrix eigenvalue analysis, the scalability is investigated for platoons under two typical information flow topologies: 1) for the bidirectional topology, the stability margin of the platoon decays to zero as O(1/N²); and 2) for the bidirectional-leader topology, the stability margin is always bounded and independent of the platoon size. Numerical simulations are used to illustrate the results.
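
    The two scalability statements can be reproduced numerically under the common reduction in which the stability margin is governed by the smallest eigenvalue of the grounded Laplacian L + P of the information flow topology (L couples followers, P pins vehicles that receive leader information). Unit edge weights are assumed here purely for illustration.

```python
import numpy as np

# Numerical sketch of the scalability comparison quoted above, assuming the
# margin scales with the smallest eigenvalue of L + P and unit edge weights.

def path_laplacian(n):
    """Laplacian of a path graph: bidirectional predecessor-follower links."""
    L = np.zeros((n, n))
    for i in range(n - 1):
        L[i, i] += 1; L[i + 1, i + 1] += 1
        L[i, i + 1] -= 1; L[i + 1, i] -= 1
    return L

for n in (10, 20, 40, 80):
    L = path_laplacian(n)
    bd = np.diag([1.0] + [0.0] * (n - 1))  # bidirectional: only vehicle 1
                                           # receives leader information
    bdl = np.eye(n)                        # bidirectional-leader: all vehicles do
    lam_bd = np.linalg.eigvalsh(L + bd)[0]
    lam_bdl = np.linalg.eigvalsh(L + bdl)[0]
    print(f"N={n:3d}  BD margin ~ {lam_bd:.5f}  "
          f"(N^2 * margin = {n * n * lam_bd:.2f})  BDL margin ~ {lam_bdl:.3f}")
```

    The printed N² * margin column stays roughly constant for the bidirectional topology (the O(1/N²) decay), while the bidirectional-leader margin stays at 1 regardless of platoon size.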

    Parallel Optimal Control for Cooperative Automation of Large-scale Connected Vehicles via ADMM

    This paper proposes a parallel optimization algorithm for the cooperative automation of large-scale connected vehicles. The task of cooperative automation is formulated as a centralized optimization problem over the joint decision space of all vehicles. Considering the uncertainty of the environment, the problem is solved in a receding-horizon fashion. The alternating direction method of multipliers (ADMM) is then employed to solve the centralized optimization in a parallel way, which scales much more favorably to large-scale instances. In addition, a Taylor series expansion is used to linearize the nonconvex coupling collision-avoidance constraints among interacting vehicles. Simulations of two typical traffic scenes with multiple vehicles demonstrate the effectiveness and efficiency of the method.
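
    A minimal consensus-ADMM sketch illustrates the parallel structure: each vehicle solves a small local subproblem, and a cheap coordination step plus a dual update reconcile the coupled variables. The scalar quadratic per-vehicle costs below are stand-ins for the real receding-horizon trajectory subproblems with linearized collision-avoidance constraints.

```python
import numpy as np

# Consensus ADMM on a toy problem:  min sum_i 0.5*(x - a_i)^2  with local
# copies x_i forced to agree with a shared variable z. The per-vehicle costs
# and the coupling are illustrative assumptions, not the paper's formulation.

rho = 1.0
targets = np.array([0.0, 4.0, 8.0])   # each vehicle's preferred value
n = len(targets)
x = np.zeros(n)                        # local copies (one per vehicle)
z = 0.0                                # shared / coupling variable
u = np.zeros(n)                        # scaled dual variables

for _ in range(50):
    # x-update: every vehicle solves its subproblem independently
    # (in parallel in practice):  min_x 0.5*(x - a_i)^2 + rho/2*(x - z + u_i)^2
    x = (targets + rho * (z - u)) / (1.0 + rho)
    # z-update: coordination step (averaging enforces the coupling x_i = z)
    z = np.mean(x + u)
    # dual update: accumulate the consensus residuals
    u += x - z

print(f"consensus value z = {z:.3f} "
      f"(optimum is the mean of targets = {np.mean(targets):.3f})")
```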

    Safe Reinforcement Learning with Dual Robustness

    Reinforcement learning (RL) agents are vulnerable to adversarial disturbances, which can deteriorate task performance or compromise safety specifications. Existing methods either address safety requirements under the assumption of no adversary (e.g., safe RL) or only focus on robustness against performance adversaries (e.g., robust RL). Learning one policy that is both safe and robust remains a challenging open problem. The difficulty lies in tackling two intertwined aspects in the worst case: feasibility and optimality. Optimality is only valid inside the feasible region, while identification of the maximal feasible region must rely on learning the optimal policy. To address this issue, we propose a systematic framework that unifies safe RL and robust RL, including problem formulation, iteration scheme, convergence analysis, and practical algorithm design. This unification is built upon constrained two-player zero-sum Markov games. A dual policy iteration scheme is proposed, which simultaneously optimizes a task policy and a safety policy, and its convergence is proved. Furthermore, we design a deep RL algorithm for practical implementation, called the dually robust actor-critic (DRAC). Evaluations on safety-critical benchmarks demonstrate that DRAC achieves high performance and persistent safety under all scenarios (no adversary, safety adversary, performance adversary), significantly outperforming all baselines.
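
    The interplay of worst-case feasibility and worst-case optimality can be shown with a one-step matrix-game toy. The payoff and cost matrices below are made-up numbers, and the full DRAC algorithm learns such quantities with actor-critic networks; the toy only shows why the safety policy and the task policy must both reason against an adversary.

```python
import numpy as np

# One-step constrained zero-sum game with made-up numbers. Rows are
# protagonist actions, columns are adversary disturbances.

reward = np.array([[3.0,  1.0],    # action 0 vs. disturbance {0, 1}
                   [5.0, -1.0],    # action 1 (best nominal reward, but risky)
                   [2.0,  2.0]])   # action 2
cost   = np.array([[0.0,  0.0],    # constraint violation under the same
                   [0.0,  2.0],    # action/disturbance pairs
                   [0.0,  0.0]])

# Safety view: worst-case constraint violation of each action.
worst_cost = cost.max(axis=1)
safe_actions = np.where(worst_cost <= 0.0)[0]       # robustly feasible actions

# Task view: among robustly feasible actions, maximize the worst-case return
# (the inner minimum plays the role of the performance adversary).
worst_return = reward.min(axis=1)
best = safe_actions[np.argmax(worst_return[safe_actions])]

print("robustly safe actions:", safe_actions)       # actions 0 and 2
print("chosen action:", best, "worst-case return:", worst_return[best])
```

    Action 1 has the highest nominal reward but is ruled out because a safety adversary can force a violation; among the robustly safe actions, the one with the best worst-case return is chosen.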